Sampling Rare Events in Non-Equilibrium and Non-Stationary Systems 
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Although many computational methods for rare event sampling exist, this type of calculation is not 
usually practical for general nonequilibrium conditions, with macroscopically irreversible dynamics 
and away from both stationary and metastable states. A novel method for calculating the time-series 
of the probability of a rare event is presented which is designed for these conditions. The method 
is validated for the cases of the Glauber-Ising model under time- varying shear flow, the Kawasaki- 
Ising model after a quench into the region between nucleation dominated and spinodal decomposition 
dominated phase change dynamics, and the parallel open asymmetric exclusion process (p-o ASEP). 
The method requires a subdivision of the phase space of the system: it is benchmarked and found 
to scale well for increasingly fine subdivisions, meaning that it can be applied without detailed 
foreknowledge of the physically important reaction pathways. 



I. INTRODUCTION 

Events which are highly improbable often have great 
importance to the behaviour of a system. The classic 
example of this is nucleation of a raindrop from super- 
saturated water vapour. Because droplets smaller than a 
critical radius are energetically unfavourable, the forma- 
tion of a super-critical droplet is unlikely to occur on the 
timescale of thermal motion of water molecules. Hence a 
straightforward computer simulation would waste much 
CPU time on unimportant fluctuations before producing 
an event of interest. 



A. The Rare Event Literature 

A large number of approaches have been developed to 
solve so-called "rare event" problems. Many of these ap- 
proaches are based on Transition State Theory [l|, 
i.e. on the concept of a quasi-equilibrium free energy 
landscape and of a particular "slow" motion of the sys- 
tem within this landscape. The landscape is imagined 
as consisting of basins linked together by 'Transition 
Paths' passing over saddle points. This landscape can 
be mapped using equilbrium methods including e^. um- 
brella sampling Q, multi-canonical sampling and 
Wang-Landau sampling Q ; and the motion across saddle 
points, once they are identified, can then be simulated di- 
rectly by initialising molecular dynamics simulations near 
to the saddle or even just reconstructed from the poten- 
tial of mean force Q . This approach of proceeding from a 
free energy surface to an understanding of the kinetics is 
principled and highly attractive but is limited to systems 
for which such a surface can be meaningfully defined and 
practically computed. 

For systems away from equilibrium the concept of free 
energy becomes problematic, although theories which use 
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analogues or extensions of this idea are rapidly being de- 
veloped; for example by considering the transition prob- 
abilities between states Q rather than directly assigning 
a free energy and the associated Boltzmann probability 
distribution directly to states themselves. 

In order to numerically study rare events under con- 
ditions where the concept of a free energy landscape is 
problematic. Transition Path Samphng (TPS) 3, For- 
ward Flux Sampling (FFS) ^B, llfl], Weighted Ensemble 
(WE) [ll| and a suite of related methods have been de- 
veloped. The basic idea of these methods is to selec- 
tively sample from the set of pathways which the system 
can take, by increasing the number of pathways in the 
important regions of the state space of the system, but 
compensating by attaching a variable statistical weight 
to each path. This group of methods is aimed at steady 
state non-equilibrium systems, and except for one very 
recent paper on WE [13] (published after this work was 
substantially complete) the potential for adapting or re- 
formulating them to give a time-dependent description 
of non-stationary dynamics has not yet been explored. 
Many processes (such as quenching, aging, ignition and 
impact) are naturally framed in a strictly non-steady 
or time-dependent way: beyond the equilibrium/quasi- 
equilibrium and also the stationary nonequilibrium treat- 
ments. The development of non-stationary rare event 
methods is therefore of potentially great importance. 



B. Phase-Space Binning and Reweighting 

The starting point for this work is a phase space bin- 
ning according to some macroscopic coordinate A (which 
is often called the "reaction coordinate" although it is not 
usually the true reaction coordinate of the process). Bi- 
ased sampling is then performed so as to generate paths 
which move through specific bins on A. This strategy of 
projecting the phase space of the system onto some sub- 
space of one or more dimensions; dividing the subspace 
with a set of partitions and then running short trajec- 
tory paths in or between compartments is common to 
FFS H, WE [ni, Milestoning and Boxed Molecular 
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Dynamics [ij] and has been very successful. Although 
these algorithms do not all require detailed balance and 
are successful for treating non-equilibrium steady states, 
application to general non-stationary dynamics is still ex- 
ploratory, and is so far only shown for WE [T^] (although 
the authors step back from actually claiming this, pre- 
ferring to state that the method covers a 'broad class of 
stochastic processes' rather than the full range of stochas- 
tic non-stationary dynamics). The requirement for sta- 
tionarity in all existing methods apart from WE arises 
from the assumption that microstates from a given com- 
partment can be treated interchangably at the compart- 
ment boundaries - regardless of, often importantly, the 
duration of the path which has led to a given state. 

We generalise this strategy of compartmentation to 
non-steady-state systems; in essence only by fixing the 
duration of each trajectory fragment (here called a 
'shot'), so that the time evolution can easily be tracked. 
The resulting method, once sampling and reweight- 
ing schemes have been developed around this central 
premise, is termed Stochastic Process Rare Event Sam- 
pling (S-PRES). The most important design choice is the 
procedure used to ensure dynamically adaptive sampling 
rates for the different bins. This is achieved here using a 
variant of Rosenbluth sampling, as is sometimes used in 
EES; rather than by moving the bins as has been investi- 
gated for WE. The choice to keep the bin positions fixed 
has the benefit of allowing a high-level and mathemati- 
cally friendly description of the dynamics to be developed 
online in the form of a time-dependent matrix of transi- 
tion frequencies between the bins. 

II. S-PRES: ALGORITHM DESCRIPTION 
A. Overview 

We define a scalar-valued coordinate A as a function 
over the state space of our system. This coordinate is 
discretized into bins labelled by an index i. (Note that 
the phase space of the system need neither be discretized 
nor finite. These conditions only need to hold for the co- 
ordinate binning.) As an example choice for A, one might 
use the number of particles in a liquid droplet forming in 
supersaturated vapour. 

The main goal of S-PRES is to observe a roughly con- 
stant number of forward transitions from each bin i on 
each interval [t,t + t]; where a forward transition from i 
at t is defined as any shot where the microstate at time 
t + T falls within a bin j > i. In this way unlikely transi- 
tions can be explored and sampled with high statistical 
accuracy. 

In principle, A could be a vector instead of a scalar. 
The restriction to a scalar is used here to simplify the 
discussion. If using a vector-valued A, the concept of 
'forward' becomes non-obvious. An example approach in 
two or more dimensions is to define a Hamming distance 
as the number of bin boundaries which remain to be be 



crossed in order to reach some target bin: 'forward' then 
describes any shot which decreases this Hamming dis- 
tance. 

B. Importance Sampling for Adaptivity 

We carry out a variable number n* of 'shots' (short dy- 
namics runs) of a fixed duration r from the configurations 
in each bin i at each time t. (Here and in the following 
we use the term "configuration" for a microstate of the 
system at a given time on a given path. Two paths can, 
in principle, reach identical microstates at the same time. 
In this case the algorithm would still hold two configura- 
tions.) 

As we are considering stochastic processes, shots from 
the same starting configuration can be made to diverge 
by varying the random number seed used to generate 
the dynamics, n* is adapted during the simulation to 
improve statistics. During the course of the simulation, 
the number of bins which are populated by configura- 
tions gradually increases (indicated by the triangles and 
squares in fig. [1]) until transitions are sampled at each 
timestep from all bins which have a non-zero occupa- 
tion probability. In order to achieve a roughly constant 
number of forward transitions from each bin i, we select 
nl based on the estimated transition probabilities at the 
previous timestep. The number of shots from bin i at 
time t + T is defined as: 

^r = K +7(^-1)^*1 (1) 

where N is the target number of forward transitions, 
7 is a damping factor, R is the number of shots which 
moved forward from bin i to bins over higher ranges of A 
on the step t — t to t, (or i? = 1 if this number is zero). 
The brackets [ ] indicate the ceiling function. This adap- 
tive sampling method makes S-PRES akin of the class of 
variational approaches to stead y-st ate importance sam- 
pling (IS) described elsewhere [l^. Selection of values 
for the parameters N and 7 is discussed in sec. Ill El 

C. Explanation of Sampling for Path Generation 

In order to enhance the exploration of rare states 
we apply a version of the pruned-enriched Rosenbluth 
method (PERM) ^Td\, i.e. when picking configurations 
from a given bin as starting points for new shots, we do 
not select them with equal probabilities, but with a vari- 
able statistical weight which depends on the sampling 
history. In conventional PERM paths at a given time- 
point are either discarded, or selected for exactly one or 
two copies to extend to the next timepoint, based on 
lower and upper thresholds in their weights. The im- 
plementation presented here avoids having to set these 
thresholds. The number of branches n* is shared out 
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FIG. 1. Schematic representation of path generation. 

Symbols indicate configurations, lines represent path seg- 
ments of duration r. 

between paths in each bin i randomly in proportion to 
the path weights. This has the same effect on average 
of discarding the relatively unlikely paths and sharing 
out the weights of the relatively more likely paths; such 
that statistics over a given bin are not dominated by a 
few highly weighted paths and also such that computa- 
tional effort is not wasted on highly unlikely paths which 
contribute almost nothing to the statistics. 

Here we give a non-mathematical introduction to the 
procedure (see fig. [U where symbols stand for configu- 
rations and lines indicate path segments) and then we 
motivate the method further in IIIDI 

Initially, configurations in each bin i with non-zero 
number of occupying configurations are selected with 
equal probability as starting points for the n° paths from 
that bin (in the example of fig.[T]with probability 1 for the 
single circle in bin 1 at time=0; and probability 1/2 for 
each of the two circles in bin 2 at time=0). Subsequently, 
we take into account from which bin i a configuration in 
j originates (the number of branches extending from the 
configurations in i determines the weight of their end- 
points in j.) In fig.IU the left triangle in bin 2 stems from 
a path with weight 1/2 (1 state divided by 2 branches), 
while the right triangle stems from a path with weight 1 
(2 states by 2 branches). Hence, when selecting starting 
configurations for new shots, the left triangle in bin 2 is 
chosen with half the probability of the right triangle. 



weights need to be taken into account when selecting 
starting configurations for new shots. We now provide 
a detailed explanation of the procedure to do this. 

We define P*(i) as the proportion of configurations in 
bin i at time t assuming infinitely many configurations. 
We call the estimate of P'^{i) from a finite number of 
configurations d*. Similarly, we define P*{j\i) as the pro- 
portion of pathways from bin i which end in j in the case 
of infinitely many trajectories, and its estimate as M*^ . 

As in the example of fig. [1] we begin at time t — hy 
picking configurations in bin i with equal probability as 
starting configurations for shots. At time t = t we count 
N[j pathways that went from i to j. Each of these path- 
ways has an equal weight P'^{i,j) /N^j , because they each 
had the same chance to be selected for shots from bin i. 
For the next step, we would like to pick configurations 
from bin j such that the probability of a configuration 
sj being picked is proportional to the weight of its path 
P(select : sj) cx P"^ {i, j)/N[j. To conveniently compute 
this we normalize by P{j) and write: 



P(select : sj) = { )P^^,J)/P^J) 

P(select : sj) = {J-)P^J\^)P^^)/Y^p-{J\^')P^{^') 
hi ^> 

During the course of the simulation we do not know the 
values of P^{j\i) and P'^{i)- However, we do have the es- 
timates Mj^j and dl. As we sample within a bin without 
bias, the errors in (i[ (and MJ relative to P'^{i) (and 
P'^{j\i)) are zero-mean. Therefore all products and ratios 
of different errors are also zero-mean. And hence the laws 
of conditional probability can be applied to the estimated 
probabilities without introducing any bias. Therefore an 
estimate of the optimal P(select : sj) can be defined as 
a function of M[j and dj , without introducing any bias: 

_ Ml^dJ 



D. Motivation of Sampling Strategy 

We bias the sampling of trajectories towards rare 
events by adapting the number of shots n* from a bin 
i at a time t such that sufficient statistics are produced 
for rare transitions (eqn. [T]). However, we do not bias 
within a bin: when we select a sample of n* configura- 
tions from the path endpoints in a given bin i as starting 
points for new shots, we do not apply any bias. 

Selecting configurations without bias does not imply 
that they are drawn with equal probabilities. On the 
contrary, as the total number of shots n* varies betwen 
bins, pathways that arrive in a bin j from different bins 
i,j,k... have different statistical weights (according to 
the respective values of n*""^, ri^~^ . . . ). These 



Selecting configurations for shots using P(select) in- 
stead of P(select) is perfectly acceptable: in a large 
number of repeated experiments each configuration will 
be selected a number of times proportional to its true 
P(select), and correct average properties will be ob- 
served. 

The central trick of the algorithm is that although the 
trajectories which enter a given bin do not have an equal 
statistical weight; those which leave a given bin do have 
an equal statistical weight, because their selection for 
shots is determined by an unbiased estimate of P(select). 
For this reason, the selection formula ([2]) can be applied 
at every timestep without explicitly considering the his- 
tories of trajectories more than one timestep into the 
past. 
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E. Algorithm Parameters 

The duration of path segments t, the 'damping fac- 
tor' 7 (from eqn. [T|) and the target number of forward 
transitions per bin N must be set by the practitioner. 
For the method to be efficient, r should be shorter than 
the shortest passage time of the rare event and longer 
than the timescale necessary to move from one A-bin to 
another. If the dynamics can vary such that the time 
required to transit between A-bins is often longer than t, 
then a small value of 7 (say between 0.01 and 0.5) can 
improve stability by stopping the adaptive sampling from 
attempting too many shots before it is possible that even 
one of them moves forward. On the other hand, in the 
case that t should come near to the timescale over which 
the dynamics can evolve globally with respect to time 
then 7 should be set close to 1, allowing the sampling 
rate to adjust rapidly. 

It is possible to dynamically vary both 7 and t if 
needed; for instance in a glassy material r can be in- 
creased as the system arrests. If the system has a com- 
plex configurational space, with multiple substantially 
different reaction pathways all projecting onto the same 
values of A, then 'dead-end' states may sometimes be 
reached, where rii grows out of control even for small 7 
and large r. In this case, if a better A is not available, 
a threshold should be defined for the largest acceptable 
Ui given practical computational constraints. N should 
be set large enough such that it is larger than the typical 
fluctuations in n^. 



F. Extraction of Observables and Statistics 

In order to compute expectation values of observables, 
we consider time slices throught the set of pathways. A 
configuration s from the set of configurations at time t 
is associated with a microscopic weight w*. All config- 
urations at the same timepoint in an S-PRES run have 
had the same history of external control parameters. The 
microscopic weight of a configuration s is given by: 

wl = P(select : s* )d* 



using the same reasoning as eqn. [5] w* is proportional 
to the estimated probability of occurrence of the config- 
uration at the same timepoint in a repeated experiment 
with the same control parameter history and distribution 
of initial configurations (but, obviously, with different in- 
dividual trajectories due to the stochastic nature of the 
dynamics). 

In order to extract the expected value of some observ- 
able it is required to take an average over all config- 
urations s at time t. The sum of the weights over all 



configurations is 1. Then the time slice average is given 
by: 

x* = 5]«;>(s*) (4) 

This sample average will converge with a sufficiently 
large number of configurations. However, estimates of 
fluctuations require some care because the different con- 
flgurations held at a given time are likely to have some de- 
gree of mutual information, having previously branched 
from a single parent or grandparent conflguration, lead- 
ing to potentially dramatic underestimates of the vari- 
ance. The simplest solution to this problem is to run 
two completely independent calculations from different 
sets of starting configurations, calculating the variance 
at t in the second run based on the estimated mean at t 
from the first run (and vice- versa) . Estimates of the error 
(as distinct from fiuctuation) either in direct observables 
or in variances can then be achieved by making further 
independent runs. 

G. Macroscopic Description of Time-Evolution 

In order to estimate the progress of the algorithm on- 
line, we consider the system dynamics in terms of the 
macroscopic coordinate A. These dynamics will in gen- 
eral not be Markovian. For the following discussion, 
we borrow (with apologies) some mathematical notation 
from the language of Markov processes, but we do not 
imply that the dynamics in terms of A are Markovian. 

We consider a time series of 'macrostate vectors' (? of 
the estimated occupation probabilities of the bins, begin- 
ning from the initial distribution which has been chosen 
and progressing such that: = ((i**)"'"M*. 

We can define the entries of d* and M* just like any 
other observables, as the sum of the associated weights 
(as from eqn. ^ 

4 = E-*^* (5) 

Ml = Y,w\5l'^5]/<fr (6) 

W} 

where (5* is 1 if the path occupies bin i at time oth- 
erwise 0. 

In the case of a Markov process, a marginalisation 
is carried out over 'parent' bins i such that = 
(i*~^M*~'^. Because we have time- varying control pa- 
rameters, and because we do not make the assumption 
of memorylessness, this marginalisation acquires caveats. 
Obviously, we must bear in mind that the estimated occu- 
pation probabilities depend on the history of control pa- 
rameters; and that they are by definition time-dependent. 
Less obviously, it does not hold that the marginalisation 
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over the 'parent' bins i can be carried out without loss of 
information as in the case of Markov dynamics, meaning 
that our MI~'^ represents an average over the configu- 
rations in i, rather than having the same value for each 
configuration in i. In particular, the practitioner should 
be aware that non-trivial correlations of the process on a 
time-scale longer than r are not captured by the descrip- 
tion in terms of the matrices M*. 

The extent to which a given M* can be transplanted 
to a different timepoint t or to describe a system which 
was initialised with different starting conditions, or with 
a different history of control parameters, must be judged 
by the practitioner. If the series of M* is stable for suc- 
cessive timepoints then this indicates that the dynamics, 
at least with respect to A, may have converged to some 
stationary limit. Under the assumption of stationarity in 
the dynamics, diagonalisation of M* to find the infinite- 
time distribution with respect to A can usefully be carried 
out, as well as other tactics commonly used to condense 
a description of the kinetics from a Markov-like matrix 
of estimated transition probabilities ■ 



H. S-PRES Algorithm Pseudocode 

To aid implementation a step-by-step guide to an ex- 
ample program is provided. 

The set of starting configurations can be prepared in 
any way that is of interest. E.g. one could prepare an 
equilibrium set for a given control parameter and then 
use S-PRES to perform a quench, i.e. change the param- 
eter and observe the system dynamics. Or one could start 
out from a single configuration and observe how trajec- 
tories diverge from this point. If the system is intended 
to be set up in a stationary state then a conventional 
rare event sampling method or an initialising round of S- 
PRES can be run for whatever length of time is needed 
to prepare a set of configurations with correct associated 
weights over a good range of A. 

1. Prepare a set of (one or more) configurations of the 
system at time i — 0. 

2. Find A for each configuration, and associate it to 
the appropriate discrete bin w.r.t. A. 

3. Prepare a vector giving the estimated occupa- 
tion probability of each bin on A at t = 0. 

4. Prepare a vector fiP giving the number of shots to 
make from each bin at t = 0. 

5. Loop for t = Q to t = oo: 

(a) Set all transition probability estimates Ml ^ — 
0. 

(b) Loop for all i s.t. c?,- ^ 0: 

i. Repeat n\ times: 



A. Select a configuration in the bin i. At 
t > use eqn. [2] (with a shift of index) 
to weight the configurations according 
to their previous bin. At t = set all 
configurations in i as equiprobable. 

B. Run dynamics of duration t. 

C. Calculate A for the evolved configura- 
tion and find bin j given A. 

D. Associate the evolved configuration 
to bin i for the next timestep, also 
recording its origin as i. 

E. Set M*^- = M*^- + 1/n*. 

ii. Set sampling rate n/"*"^ using eqn. [1] 

(c) Print the matrix M* and the vector d*. 

(d) Print any further observables derived using 
eqn. S) 

(e) Set = M*f?. 

(f) Set f = < + r. 

I. Boundary Conditions for Flux Calculations 

S-PRES can be used in two ways, either to calcu- 
late the time dependent probability distribution of some 
static observable or to calculate the time dependent re- 
action flux (j)(t) between two specifically chosen 'source' 
and 'sink' bins on A (which in the following we call A 
and B). The latter quantity is the non-stationary ana- 
logue of that which is usually calculated using TPS and 
FFS methods, and the former of that which is usually 
calculated via IS techniques. Flux calculations typically 
require special treatment of boundary conditions, in or- 
der to remove the effects of granularity in time and in 
order to create a system which can remain far from equi- 
librium indefinitely. 

In order to achieve a definition of the forward flux 
which is strictly independent of r it is necessary that runs 
which enter the 'sink' region, B, are halted immediately. 
This may be computationally costly if the coordinate A is 
costly to calculate, but cannot be avoided if an accurate 
flux is desired. If paths were allowed to enter and leave 
B, this would not correspond to the accepted definition 
of forward reaction fiux as the probability per unit time 
of a first passage from AtoB. 

The region B is treated as absorbing in this way: at 
the end of each timestep, the probability vector entry 
corresponding to B is set equal to zero and the entry 
corresponding to the region A is incremented by the flux 
which has been deleted. No new configurations are actu- 
ally transferred to A, only some of the 'probability mass' 
tracked by d*. This 'short circuit' of the matrix is equiv- 
alent to a system with an infinite reservoir of states in 
A and absorbing boundary conditions at _B; which is the 
premise normally adopted for FFS. 
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A second restriction, which should not in general be 
used but which was imposed on the FFS-like "flux" vari- 
ant of the method for the calculations carried out here, 
is to instantly terminate any paths which return to A; 
and then re-initialise them with a random configuration 
from within A. This setup describes a slightly unphysical 
situation, but was required here to achieve exact corre- 
spondence of the definition of flux with existing steady- 
state FFS calculations [ll| , such that only paths from A 
to B which make the journey in a single pass without 
any return to A are considered. 

If it is preferred to calculate the time series of the state 
vector and the matrices M* (or some other extracted 
observable) without specific definition of a flux then no 
special bins A or B are defined. 



J. Non-Requirement for Poisson Statistics 

The assumption of Poisson statistics; that rare events 
occur independently and without correlation; is required 
by most existing methods [19|. This assumption may be 
an unwelcome limitation and is not required by S-PRES. 
(Although in the example of (sec. IIII[) boundary condi- 
tions were set up so as to force Poisson behaviour). The 
weighted directed acyclic graph (WDAG) of configura- 
tions which S-PRES generates can be used to measure 
the deviation from Poisson statistics. Define P{Xt'\Xt) 
as the estimated conditional probability of event X at 
time t' > t given that the system also experienced X 
at time t. This is measured by performing a sum over 
the weights of configurations at t' , counting only that set 
which are descended directly from configurations which 
experienced X at time t, {w;}, and a second sum over 
only those which experienced the event at both times, 

p(xHxo==5]W}/E{^'} (7) 

i i 

In this example use of the WDAG, the size of the set of 
joint events {w[} may in practice be so small as to cause 
sampling errors unless the deviation from Poisson statis- 
tics is large or a large calculation is carried out. Storage 
of the entire WDAG is likely to be cumbersome for many 
practical applications in which the storage capacity to 
describe every configuration will quickly mount up. 



III. APPLICATION TO RARE EVENTS IN THE 
ISING MODEL - GLAUBER-ISING UNDER 
IMPOSED SHEAR FLOW 

A. Introduction to System 

The example of nucleation in the 2D Ising model on a 
square lattice under imposed shear flow is a case that has 



been studied (although only for constant shear rate) us- 
ing FFS [13, [3 ; allowing for direct comparison of our re- 
sults with those from an established method. This sirnple 
model exhibits rich non-equilibrium phase behaviour 20] , 
however in this instance it is employed only to demon- 
strate the use of the sampling algorithm. 

The system was set up as follows (duplicating the FFS 
studies): Glauber dynamics were used to evolve the spins 
at each lattice site, meaning that at each 'sweep' L x L 
sites were chosen randomly to have their spins reassigned 
according to a Boltzmann- weighted probability. The sys- 
tem was prepared with all (65 x 65) spins down. A weak 
upward external field was applied, rendering the prepared 
state metastable relative to the stable state in which all 
or most spins are up. Shear flow can either accelerate 
or retard the formation of a nucleus of up-spins and the 
transition to the stable state, depending on the shear flow 
rate [l3|. 

Shear flow was applied L times at each sweep by ran- 
domly selecting one of the L horizontal lines between 
rows of lattice sites and applying a move with probabil- 
ity 7 such that all sites above it are translated one space 
to the left, with periodic wrapping such that the spin at 
i = 1 becomes the spin at i = L; this is a simple model 
of inflnite two-dimensional laminar Couette flow. 

A subtlety enters in the treatment of the vertical peri- 
odic boundaries (the interaction between sites with j = I 
and j = L): in order to avoid shearing along this line un- 
less it has been explicitly selected, an offset pointer is 
maintained so that even after the spins in the row with 
j ~ I have been moved (after a shear at say, the bound- 
ary between j = 5 and j = 6) they remain in contact via 
periodic imaging with the same spins in the row j = N 
as they were before. 

A long discussion of the detailed implementation of 
this model system is available ^SJ. The notation 7 to 
indicate the rate of the imposed shear flow is used here 
for consistency with this earlier work and has no relation 
to the 7 of eqn.[TJ which is used to indicate the 'damping' 
constant applied to stabilise sampling rates. 

In order to generalise the steady-state model having 
constant shear flow to a simple and directly comparable 
non-steady-state case we subjected the system to a time- 
series of three different shear flow rates within the low- 
shear (nucleation-enhancing) regime, allowing the nucle- 
ation rate to relax to the steady-state value after each 
change of shear flow rate. In order to produce directly 
comparable data to the FFS studies, the time dependent 
forward fiux for the system was calculated using the same 
system parameters described for the steady state calcu- 
lations in refs. [13, 13; which is to say size L = 65, cou- 
pling constant J — 0.65fcsr and external field strength 
h — O.OSfcsT. At 7 = the system can be considered 
to be in a state of quasi- equilibrium where a meta-stable 
basin and stable basin are separated by a large free en- 
ergy barrier. Classical Nucleation Theory (CNT) gives 
the size of the barrier to nucleation as w 22kBT for 
this regime signifying that a rare-event technique 
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is strongly recommended to extract meaningful statistics 
by simulation. 



B. Definition of Coordinate Bins for Sliear-Ising 
Calculation 

The coordinate over configurations was defined simply 
as the total number of up spins present, A — Nup. The 
source bin, A, was defined as A < 25 and the sink bin B 
was defined as A > 2005 (as in the FFS calculation fl8l|). 
The intervening space on A was divided into 990 equal 
increments. It might have been possible to make a more 
sympathetic definition of the intervening bins, such as 
by spacing them more closely together for smaller values 
of A where the dynamics on A is expected to be slow, 
however this crude binning was found to be effective. 

The sampling parameters 7 = 0.5, r = 10 and N = 100 
were used. Occasional 'dead-end' configurations mani- 
fested, where A was large due to multiple isolated clusters 
of spins rather than due to a single nucleus: a maximum 
Tii threshold of 2000 was therefore set, in order to prevent 
eqn. [T]from diverging due to these instances. 



C. Results for Shear- Ising Calculation 

It was necessary to run the calculation for 2500 MC 
sweeps at the initial shear rate 7 = 0.04 before all bins 
of the coordinate A were populated, allowing meaningful 
statistics to be collected. Fig. [5] shows the fiux against 
time as the shear rates were changed (7 = 0.04, 0.02, 
and 0.0). Horizontal lines (black) indicate steady state 
FFS data from a separate research group [l3|; the trace 
(red online) is the S-PRES results. After each change of 
shear rate the time-dependent flux relaxes to the known 
steady state value (actually the quasi-equilibrium value 
in the case 7 — 0.0), validating the method. The trace is 
an average over 100 independent runs. 



IV. APPLICATION TO RARE EVENTS IN THE 
ISING MODEL - KAWASAKI-ISING AFTER A 
QUENCH 

A. Introduction to System 
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FIG. 2. Example use of S-PRES: Nucleation in the 
2D Ising model under shear. Solid horizontal lines: ref- 
erence steady state nucleation rates for each value of imposed 
shear 10] (dashed lines show the reported errorbars). Fluc- 
tuating trace (red online): S-PRES time-series as the shear is 
changed. 



it is difficult to predict their rates of collision and growth 
or shrinkage and the evolution of the size distribution 
of clusters over time. Despite these difficulties a theory 
based on the iterative evolution of a population vector 
of clusters of different sizes, p„(i) is available from the 
literature (22| . which has not until now received direct 
validation from simulation studies (although a closely re- 
lated approach has had the benefit of such scrutiny 23] ) . 
The existence of an untested theory for such a simple but 
important model system is an ideal opportunity to fur- 
ther demonstrate the S-PRES method while at the same 
time making a small contribution to the basic study of 
phase-change dynamics. 

The equilibrium thermodynamics of this model are 
well understood, as are the phase-change dynamics in 
both the nucleation-dominated (surface-energy limited) 
and spinodal decomposition-dominated (diffusion lim- 
ited) regimes [l^]- We carried out an instantaneous 
quench from T = 00 to T — 0.6Tc, which lies between 
these two regimes, for a system of 100 x 100 spins with 
a 0.1 concentration of up-spins. The coupling constant 
J was set to IksT. T ~ O.OTc was set to 1.36151, using 
the Onsager resuh of 2//n(l + ^2) (Hj. 



Phase separation in the 2D Ising model af- 
ter a quench into the temperature region between 
the nucleation-dominated and spinodal decomposition- 
dominated regimes is a quintessential problem in non- 
equilbirium dynamics. Under Kawasaki dynamics (some- 
times called a 'lattice gas') the total magnetisation is 
conserved and time-evolution is controlled by diffusion 
of spins. The base timescale of the system is set by one 
MC sweep, equal to Nup attempts to move a random up- 
spin. Because the diffusive and evaporative behaviour of 
spin clusters is determined by both their size and shape 



B. Definition of Coordinate Bins for 
Kawasaki-Ising after a Quench 

As the coordinate we chose A = X^cl^-c ~ 1)' where ric 
is the number of spins in cluster c and a cluster is defined 
as a connected group of up spins. This coordinate was 
chosen because it is simple to calculate and has a value 
of zero when all up spins are isolated, increasing after 
any collision between spins or clusters. The lowest bin 
was defined as A < 9 and the highest bin was defined as 
A > 91, with the intervening integer values assigned to a 
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single bin each. 

The sampHng parameters 7 ~ 0.5, t ~ 10 and N = 100 
were used. 



C. Results for Kawasaki-Ising Calculation 

In order to prepare initial states including unusually 
large clusters the system was prepared in the T — 00 
regime using some few hundreds of iterations of S-PRES 
in order to achieve statistics down into the range p„ = 
10^^^ before applying the quench. 

The probability distribution of cluster sizes p7i is not 
directly available from the probability distribution of re- 
action coordinate bins d; instead it was required to cal- 
culate it as an average over all configurations generated 
at each timestep, weighted according to eqn. H) In fig. [3] 
we show S-PRES results for p^. There is good qualita- 
tive agreement with the theory, which is shown in fig. [3]- 
inset. The results shown are averaged over 10 indepen- 
dent calculations; error bars are the estimated standard 
errors over the 10 values. Brute force calculation in the 
T = 00 regime is very cheap due to the lack of interac- 
tions between spins at this temperature, therefore data 
at t = 0, T = 00 from a brute force calculation separate 
to the S-PRES calculation is also shown in fig. [3] (and 
also the standard result p„ = e~"/^ 123)- The exist- 
ing quantitative results for the infinite temperature case 
highlight a potential source of problems caused by the er- 
ror behaviour of S-PRES - until convergence is achieved, 
probabilities of rare states are reported as zero; meaning 
that S-PRES will converge on the correct values from 
below. 

To comprehensively explore the applicability of the 
Mirold-Binder theory is not the aim of this work, and 
would require further data over a wide range of temper- 
atures and concentrations, however to provide numerical 
results for systems previously accessible only to theory 
is an example of the type of research into phase change 
dynamics which can be carried out using S-PRES. 
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APPLICATION TO RARE EVENTS IN A 
TIME-DEPENDENT ASYMMETRIC 
EXCLUSION PROCESS 



FIG. 3. (a) Example use of S-PRES: Temperature 
quench of the 2D Ising model with conserved order 
parameter. Calculated time evolution of the domain-size 
distribution is compared with theory [S^l (inset). The the- 
ory is well outside the error bars (which are invisible except 
for very small pn), but does provides qualitative agreement 
in so far as reproducing the shapes of the four curves, (b) 
Special case t = 0, T = 00. At T = 00 a brute-force calcula- 
tion is easy, so is superimposed on the S-PRES data down to 
p„ X 10~*. The standard result p{n) — e""''^ is also shown. 
Colour online. 



up at the right boundary of the system, forming a block 
with density (1 — /3); and the remainder of the system has 
fast-moving traffic with density a. The phase boundary 
moves stochastically in the critical state according to a 
random walk. 



A. Introduction to System 

An Asymmetric Exclusion Process (ASEP) is a sim- 
ple model for driven stochastic transport. Here we dis- 
cuss the "parallel-open" (p-o) ASEP, as has been charac- 
terised by Schiitz ^2l\ . In this model, particles are intro- 
duced at the origin with a probability a at each sweep; 
and removed from the right boundary with probability /3. 
Between the two boundaries, a deterministic update rule 
allows particles to move from left to right providing that 
a vacancy exists. When a = /3, the system becomes crit- 
ical, with a divergent correlation length. Particles queue 



B. Simulation Setup 

In order to validate S-PRES against the quite tractable 
time-dependent properties of the ASEP, the system was 
initialised without any particles; and allowed to gradually 
approach the steady state, in analogy to the morning 
traffic along a busy road. The length L was set as 500 
sites and the parameters a and /3 were both set as 0.01. 
12,000 brute force calculations were run, each of duration 
10^ sweeps. A single S-PRES calculation was also set up, 
with A defined as the number of particles, divided over 
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FIG. 4. Probability that the ASEP is completely full, 

given that it is empty at t = 0. Tlie S-PRES calculation 
agrees with theory over 46 decades. Colour online. 



100 equal-sized bins. The S-PRES parameters r — 500 
and 7 = 0.5 were used. 



C. Results for ASEP Calculation 

The rare event in this case is the fuh occupation of 
the system, i.e. the particle number equalhng the num- 
ber of sites. Statistics were collected from the S-PRES 
and brute force runs on the probabihty of the rare event 
P( full). This is available in the steady-state limit from 
[271 as P{fuU)t^oo = 2(1 - a)^/^/L. Assuming that 
the phase boundary moves as a random walk starting 
from the origin at t = 0, and that the density of the 
dense phase is constant; then the time-dependent value 
of P{full) is readily available by numerically iterating 
Pick's first law, beginning with the probability density 
defined as 1.0 at the origin and elsewhere. 

The S-PRES and brute force calculations converged to 
the steady-state limit and the S-PRES calculation was 
also able to confirm that kinetic properties were very ac- 
curately predicted by the assumption of Fickian diffusion 
of the phase boundary. This is shown in fig. 2] 



VI. EXAMPLE USE OF THE MATRICES M* 

The structure of M with respect to time can be anal- 
ysed in order to estimate the usefulness of the coordinate 
projection which has been employed and to pursue in- 
sight into the mechanics of the system under considera- 
tion. An example is the extraction of committor proba- 
bilities. 

The 'committor probability' pb{s) or the ultimate 
probability that a given configuration s will complete the 
reaction before returning to some initial state is a quan- 
tity generally of interest in the analysis of rare events. 
In a non-stationary system this value can change with 
respect to time. An easy estimate of committor proba- 
bilities at a given time to can be achieved by using the 
time series of M in the following procedure: 

1. Create two 'sink' bins A and B s.t. Vt > to: M\ ^ = 
1, M*,^5 = 1, MX.^A = 0, - 0. 

2. Repeat for each bin b ^ {A, B} 

(a) Initialise a vector d s.t. di, = 1 and di = 

(b) Apply the time series of modified M* to each 
d, beginning aX t ~ to until di « Vi ^ 
{A,B}. 

(c) ds now holds the expected value of pb(s) over 
the bin b. 

The procedure above gives the committor only with re- 
spect to the bins on the projected coordinate A; the main 
purpose of such an analysis is to evaluate the usefulness 
of the particular definition of A. It is considered that the 
closest possible identity between A and the committor 
probability gives the most efficiently enhanced sampling 
for rare event methods [l^. If A does not determine 
Pb or if the granularity of the binning is large near to 
sharp changes in pB, then S-PRES becomes less useful 
and alternative strategies might be required. If an initial 
rough calculation can be made to work, then it is possible 
to record online (without the assumption of mixing) the 
mapping between A and some different observable using 
eqn. 3) In the case that large computer memory is avail- 
able then the entire WDAG of configurations connected 
by path segments can be stored; allowing formal methods 
of projection onto subspaces of manageable dimensional- 
ity [23, [so] to be experimented with offline. 

Fig. [5] shows the committor probability distributions 
for three timepoints in the evolution of the variable-shear 
system of fsec. lIII[) corresponding to three different shear- 
rates. The shifting of the distribution to the right for 
lower shears is consistent with the smaller reaction flux 
observed. This effect is due to the fact that the fre- 
quency of cluster collisions decreases with decreasing 7 
more quickly than does the "cluster evaporation" rate. 

Further statistics of interest for a typical system might 
include the transmission coefflcient k or the width of the 
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FIG. 5. Committor probabilities for 2D-Ising under 

shear, assuming that the system is initialised in bin i at each 
time t = 3000; 4000; 5000 (corresponding to 7 = 0.04, 0.02, 
and 0.0). The committor psii) moves to higher bins for later 
timesteps (lower shears), which is consistent with the lower 
nucleation rates observed. Colour online. 
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FIG. 6. Scaling of algorithm efficiency: performance is 
more robust to excessive numbers of phase space bins than it 
is to insufficient numbers of bins. As the probability of the 
rare event in consideration decreases (smaller (j>) the speedup 
becomes proportionally more. Each trace is an average over 
10 independent calculations. Colour online. 



committor distribution, as discussed for the equilibrium 
3D Ising model in ref. [31,]. These statistics can be cal- 
culated with respect to A as above; or with respect to an 
arbitrary variable by using the WD AG. 

VII. SCALING AND CHOICE OF BINNING 
A. Scaling with Relation to Fineness of Binning 

If the S-PRES calculation is set up in order to find rare 
states, then it has two phases. In the first phase ('popu- 
lation') the goal is to achieve a state whereby one or more 
configurations are associated with each bin, allowing rare 
events to be observed. In the second phase ('observa- 
tion') the goal is to continue the dynamics and observe 
the time-evolving behaviour. To make a loose scaling 
argument from equilibrium statistical mechanics, if we 
assume that the number of bins Nb is large enough that 
no significant free energy barriers exist between adjacent 
bins, but that only one new bin is populated at each 
iteration of the algorithm, then the total time required 
for the population phase should be roughly proportional 
to N]^/2. During the observation phase, the time re- 
quired per iteration should continue to be linear with 
the number of bins. Therefore the population phase is 
considered as the performance bottleneck of the method 
and 'speedup' is defined as the expected number of MC 
sweeps needed to observe the first rare event using brute 
force (equal to L"^ /4> for the shear- Ising example system) 
divided by the expected number of MC sweeps needed to 
observe the first rare event using S-PRES, which marks 
the completion of the population phase. 

We repeated the shear calculations of (sec. IIIII) at con- 
stant 7 = 0.04, 0.02 and 0.0 and with L = 50; for various 



numbers of equally spaced bins, even for small numbers 
of bins where the assumption of closely spaced bins no 
longer holds. Fig. [S] shows speedup versus the number 
of bins for calculations at the three different shear rates. 
At the shear rates for which nucleation is more rare, the 
speedup is proportionally greater. The speedup was ro- 
bust to the use of excessive numbers of bins. The smaller 
system of L = 50 was chosen for the benchmarking be- 
cause it is computationally less expensive; reaction fluxes 
were roughly equal to those observed for the larger sys- 
tem, with (/) = 1.5 X 10-12,0.8 X lO'^^ and 0.3 x lO'^^ 
for the three shear rates. 



B. Robustness to Non-Monotonic Binning 

A common thought experiment used to test 
coordinate-based rare event methods such as EPS, 
umbrella sampling or S-PRES is to imagine a system for 
which the the projected coordinate A is non-monotonic 
or sometimes ortho gon al with respect to the true reac- 
tion coordinate (lol |32|. A practical example of this is 
protein folding, where even simple proteins and peptides 
can move through sequences of transition states which 
are dissimilar to each other and to both the unfolded 
and folded conformations [1^ [13] , making it difficult to 
define a useful projection of the progress of the reaction 
without detailed prior knowledge of the the folding 
mechanism. 

S-PRES is robust to this situation in the sense that 
S-shaped trajectories can be developed by the algorithm 
because paths which move backward as well as forward 
in A are generated and stored; however it is still better to 
choose coordinate projections which are near-monotonic 
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with respect to the real progress of the reaction because 
any bins which represent muhiple stages of the 'true' re- 
action coordinate will require larger populations in order 
to give stable sampling; which will need to be crudely 
dealt with by setting a small 7 and large N in eqn. [TJ 

VIII. CONCLUDING REMARKS 

This paper presents a method, S-PRES, to investigate 
rare events in non-equilbrium and non-steady-state dy- 
namics. S-PRES can compute the evolution of the prob- 
abilities of rare events or rare states in any stochastic 
system with respect to time, providing that a suitable 
binning on the phase space can be defined. The method 
is based on Forward Flux Sampling with modifications to 
permit tracking of the ages of the configurations sampled. 
A version of the pruned-enriched Rosenbluth method is 
applied to the generation of path segments in order to 
achieve efficient sampling. 

To demonstrate the method we calculated phase 



change kinetics in the Ising model both under shear and 
after a temperature quench; confirming existing results 
from theory and simulation. We also confirmed theo- 
retical results for the time-dependent critical behaviour 
of a model of driven diffusive transport. We anticipate 
that the method is useful for a very wide range of time- 
evolving processes in nature. Possibilities include the 
probabilities of abnormal cell differentiation during em- 
bryogenesis, fracture nucleation in materials under im- 
pact or time-varying load, and nucleation in glassy ma- 
terials approaching dynamic arrest. 
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